Data Mining using Genetic Programming: The Implications of Parsimony on Generalization Error
Abstract
A common data mining heuristic is, “When choosing between models with the same training error, less complex models should be preferred as they perform better on unseen data.” This heuristic may not always hold. In genetic programming, a preference for less complex models is implemented by (i) placing a limit on the size of the evolved program, (ii) penalizing more complex individuals, or both. This paper presents a GP variant with no limit on the complexity of the evolved program that generates highly accurate models on a common dataset.
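The two parsimony mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the `Individual` class, the `penalized_fitness` function, and the parameters `size_limit` and `penalty_per_node` are hypothetical names chosen for illustration, and the linear per-node penalty is only one common way to penalize complexity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Individual:
    """Hypothetical summary of an evolved program (illustration only)."""
    training_error: float  # e.g. misclassification rate on the training set
    size: int              # number of nodes in the program tree


def penalized_fitness(
    ind: Individual,
    size_limit: Optional[int] = None,
    penalty_per_node: float = 0.0,
) -> float:
    """Return a fitness value where lower is better.

    size_limit:       hard cap on program size; larger programs are rejected.
    penalty_per_node: assumed linear complexity penalty added to the error.
    With both left at their defaults, no parsimony pressure is applied.
    """
    if size_limit is not None and ind.size > size_limit:
        return float("inf")  # program exceeds the size limit
    return ind.training_error + penalty_per_node * ind.size


# Two programs with the same training error but different complexity.
small = Individual(training_error=0.10, size=15)
large = Individual(training_error=0.10, size=200)
```

With `penalty_per_node=0.0` and no `size_limit`, the two individuals are ranked equally, which corresponds to the unconstrained setting the abstract describes; any positive penalty or a limit below 200 makes the search favor `small`.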
Similar resources
Modeling of measurement error in refractive index determination of fuel cell using neural network and genetic algorithm
Abstract: In this paper, a method for determining the refractive index in the membrane of a fuel cell, based on a three-longitudinal-mode laser heterodyne interferometer, is presented. The optical path difference between the target and reference paths is fixed, and the phase shift is then calculated in terms of the refractive index shift. The measurement accuracy of this system is limited by nonlinearity erro...
Balancing Accuracy and Parsimony in
Genetic programming is distinguished from other evolutionary algorithms in that it uses tree representations of variable size instead of linear strings of fixed length. The flexible representation scheme is very important because it allows the underlying structure of the data to be discovered automatically. One primary difficulty, however, is that the solutions may grow too big without any improvemen...
Prediction of Blasting Cost in Limestone Mines Using Gene Expression Programming Model and Artificial Neural Networks
The use of blasting cost (BC) prediction to achieve optimal fragmentation is necessary in order to control the adverse consequences of blasting such as fly rock, ground vibration, and air blast in open-pit mines. In this research work, BC is predicted through collecting 146 blasting data from six limestone mines in Iran using the artificial neural networks (ANNs), gene expression programming (G...
Balancing Accuracy and Parsimony in Genetic Programming
Genetic programming is distinguished from other evolutionary algorithms in that it uses tree representations of variable size instead of linear strings of fixed length. The flexible representation scheme is very important because it allows the underlying structure of the data to be discovered automatically. One primary difficulty, however, is that the solutions may grow too big without any improvement o...
Modeling Ghotour-Chai River’s Rainfall-Runoff process by Genetic Programming
Given the importance of water and of computing the amount of runoff resulting from precipitation, appropriate methods for predicting runoff from rainfall data have become essential in recent decades. Rainfall-runoff models are used to estimate the runoff generated from precipitation in the catchment area. The rainfall-runoff process is an entirely non-linear phenomenon....
Publication date: 1999